Fix Qwen3 recipe and update autoquant example cmd #749
Signed-off-by: weimingc <17592131+meenchen@users.noreply.github.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main     #749   +/-   ##
=======================================
  Coverage   74.65%   74.66%
=======================================
  Files         192      192
  Lines       18969    18975     +6
=======================================
+ Hits        14162    14167     +5
- Misses       4807     4808     +1
    if model_type in ["qwen3moe", "qwen3next"] and qformat == "nvfp4":
        # Disable the attention projection layers to retain accuracy
        quant_cfg["quant_cfg"]["model*.*attn*in_proj*"] = {"enable": False}
        quant_cfg["quant_cfg"]["model*.*attn*q_proj*"] = {"enable": False}
        quant_cfg["quant_cfg"]["model*.*attn*k_proj*"] = {"enable": False}
        quant_cfg["quant_cfg"]["model*.*attn*v_proj*"] = {"enable": False}
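To make the effect of the four wildcard keys concrete, here is a minimal sketch that checks which quantizer names they would select. It assumes the keys are matched with shell-style (`fnmatch`) globbing against full module names; the helper `is_attention_projection` and the example module names are hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatchcase

# The four wildcard keys from the diff above.
ATTN_PATTERNS = [
    "model*.*attn*in_proj*",
    "model*.*attn*q_proj*",
    "model*.*attn*k_proj*",
    "model*.*attn*v_proj*",
]


def is_attention_projection(name: str) -> bool:
    """Return True if `name` matches any of the patterns (assumed fnmatch semantics)."""
    return any(fnmatchcase(name, pattern) for pattern in ATTN_PATTERNS)


# Hypothetical quantizer names, for illustration only.
names = [
    "model.layers.0.self_attn.q_proj.weight_quantizer",
    "model.layers.0.linear_attn.in_proj_qkvz.weight_quantizer",
    "model.layers.0.self_attn.o_proj.weight_quantizer",
    "model.layers.0.mlp.experts.3.up_proj.weight_quantizer",
]
matched = [n for n in names if is_attention_projection(n)]
```

Under these assumptions only the first two names match: the output projection (`o_proj`) and the MoE expert layers are not covered by the patterns, so they would remain quantized.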
Is there an option to skip this setting? As written, we are hardcoding the attention skip here.
Cc @shengliangxu: a config system and model-based config examples could help improve the overall experience.
Sure. For now, feel free to add an additional flag for auto quant.
We can refactor this part once the config system is ready.
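One way the suggested flag could look, as a rough sketch only: the function name `maybe_disable_attn_quant` and the `disable_attn` parameter are hypothetical and do not come from this PR or from the library. The sketch gates the same four overrides behind a boolean and returns a copy so the caller's config is left untouched.

```python
import copy


def maybe_disable_attn_quant(quant_cfg, model_type, qformat, disable_attn=True):
    """Return a copy of quant_cfg with the attention projections disabled when requested.

    `disable_attn` is a hypothetical opt-out flag; the default keeps the
    behavior shown in the diff above.
    """
    cfg = copy.deepcopy(quant_cfg)
    if disable_attn and model_type in ("qwen3moe", "qwen3next") and qformat == "nvfp4":
        for proj in ("in_proj", "q_proj", "k_proj", "v_proj"):
            # Same wildcard keys as in the diff, built in a loop.
            cfg["quant_cfg"][f"model*.*attn*{proj}*"] = {"enable": False}
    return cfg


base = {"quant_cfg": {}}
with_skip = maybe_disable_attn_quant(base, "qwen3moe", "nvfp4")
without_skip = maybe_disable_attn_quant(base, "qwen3moe", "nvfp4", disable_attn=False)
```

A command-line switch in the example script could then be mapped onto `disable_attn`, and the whole helper replaced once the config system mentioned above is available.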
## What does this PR do?

**Type of change:** Bug fix <!-- Use one of the following: Bug fix, new feature, new example, new tests, documentation. -->

**Overview:** ?

## Usage
<!-- You can potentially add a usage example below. -->

```python
# Add a code snippet demonstrating how to use this
```

## Testing
<!-- Mention how have you tested your change if applicable. -->

## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open `Draft` PR. -->

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes/No <!--- If No, explain why. -->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No <!--- Only for new features, API changes, critical bug fixes or bw breaking changes. -->

## Additional Information
<!-- E.g. related issue. -->

Signed-off-by: weimingc <17592131+meenchen@users.noreply.github.com>